State distillation is the process of taking a number of imperfect copies of a particular quantum 
state and producing fewer better copies. Until recently, the lowest overhead method of distilling 
states $|A\rangle = (|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt{2}$ produced a single improved $|A\rangle$ state given 15 input copies. New 
block code state distillation methods can produce $k$ improved $|A\rangle$ states given $3k + 8$ input copies, 
potentially significantly reducing the overhead associated with state distillation. We construct an 
explicit surface code implementation of block code state distillation and quantitatively compare 
the overhead of this approach to the old. We find that, using the best available techniques, for 
parameters of practical interest, block code state distillation does not always lead to lower overhead, 
and, when it does, the overhead reduction is typically less than a factor of three. 



One of the grand challenges of 21st-century physics and 
engineering is to construct a practical large-scale quan- 
tum computer. One of the primary ways theoretical re- 
search can reduce the magnitude of this challenge is to 
devise ways of performing a given quantum computation 
using fewer qubits and quantum gates while simultane- 
ously leaving all other engineering targets unchanged. 

State distillation [3 [2] is a procedure required by 
the majority of concatenated quantum error correction 
(QEC) schemes [MZ], with the exception of the Steane 
code [S], and required by the majority of topological QEC 
schemes [SlfT^, with the exception of a 3-D color code 
[20] and a non-Abelian code [21]. As such, the search for 
lower overhead methods of implementing state distilla- 
tion is of great importance. 

Two recent works [22, 23] are of particular note, both 
independently proposing block code based methods taking 
$3k + 8$ imperfect copies of a particular state and distilling 
$k$ improved copies. However, a detailed analysis 
of the overhead in terms of qubits and quantum gates 
was not performed. In this work, we explicitly construct 
a surface code [W implementation of one of these block 
code state distillation methods [23]. The surface code is 
believed [23] to be the lowest overhead code that will ever 
exist for a quantum computer consisting of a 2-D array 
of qubits with nearest neighbor interactions [25-28]. Furthermore, 
this code can be used to achieve time-optimal 
quantum computation [29]. The surface code therefore 
provides an excellent framework to gauge the cost of the 
new block code state distillation methods. 



The discussion shall be organized as follows. In Section I, 
the quantum circuit used to perform block code 
state distillation is presented. In Section II, we perform 
a detailed comparison of the overhead of concatenated 
15-1 and block code state distillation. In Section III, we 
summarize our results and discuss further work. 



I. BLOCK CODE STATE DISTILLATION 

The state we are interested in distilling is $|A\rangle = 
(|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt{2}$. An extendable quantum circuit taking 
$3k + 8$ copies of $|A\rangle$, each with probability $p$ of error, 
and producing $k$ copies, each with probability approximately 
$(3k + 1)p^2$ of error [23], is shown in Figs. 1-2. T 
gate application is delayed using the circuit of Fig. 2a. 
This circuit has the additional advantage of eliminating 
X errors from the T gate, leaving us only needing to detect 
Z errors. Each T gate consumes one $|A\rangle$ state as 
shown in Fig. 2b. All output states are discarded if any 
errors are detected. Fig. 1 has been designed to detect 
a Z error during any single T gate. All other quantum 
gates are assumed to be perfect, or at least sufficiently 
reliable that the probability of error from gate failure 
is negligible compared to the probability of error from 
multiple T gate errors. The first order probability that 
the outputs will be rejected is therefore approximately 
$(3k + 8)p$, with this expression approximate due to the 
ability of Fig. 2b to introduce S errors and the ability 
of Fig. 2a to filter out everything except Z errors. First 
order expressions are appropriate as we restrict ourselves 
to $(3k + 8)p \ll 1$. 
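
The first order expressions quoted above are easy to tabulate. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the only inputs taken from the text are the output error $(3k + 1)p^2$, the rejection probability $(3k + 8)p$, and the input count $3k + 8$.

```python
def block_output_error(k: int, p: float) -> float:
    """First order error probability of each output state: (3k + 1) p^2."""
    return (3 * k + 1) * p ** 2


def block_rejection_probability(k: int, p: float) -> float:
    """First order probability that all outputs are discarded: (3k + 8) p."""
    return (3 * k + 8) * p


def inputs_per_output(k: int) -> float:
    """Number of input |A> states consumed per output state, ignoring rejection."""
    return (3 * k + 8) / k


if __name__ == "__main__":
    p = 1e-4  # illustrative input error rate, chosen so that (3k + 8) p << 1
    for k in (2, 4, 8, 16):
        print(k,
              block_output_error(k, p),
              block_rejection_probability(k, p),
              inputs_per_output(k))
```

As $k$ grows the number of inputs per output approaches three, while both the output error and the rejection probability grow linearly in $k$, which is why the first order expressions are only used when $(3k + 8)p$ is small.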

For k = 2 + 4j, the block code has the property that 
transversal S^X implements logical SX on each encoded 
logical qubit. Each logical qubit is prepared in $|A\rangle$, and 
hence in the absence of errors the multiple $|A\rangle$ block code 
will be in the +1 eigenstate of transversal S^X = XT. 
The top qubit of Fig. 1 should therefore report +1, with 
all output discarded if -1 is reported. This single measurement 
is sufficient to detect a single Z error during 
the first two layers of T gates. 

The block code has four stabilizers, specifically 
$X_0X_2X_3 \ldots X_{k+2}$, $X_1X_2 \ldots X_{k+1}X_{k+3}$, 
$Z_0Z_2Z_3 \ldots Z_{k+2}$, and $Z_1Z_2 \ldots Z_{k+1}Z_{k+3}$. Detecting a 
Z error in the final layer of T gates involves using the 
stabilizers $X_0X_2X_3 \ldots X_{k+2}$ and $X_1X_2 \ldots X_{k+1}X_{k+3}$. 
For arbitrary encoded logical states, in the absence of 
errors, the block code will be in the +1 eigenstate of 
FIG. 1: Extendable quantum circuit taking $3k + 8$ copies of 
$|A\rangle$, each with probability $p$ of error, and producing $k$ copies, 
each with approximate probability $(3k + 1)p^2$ of error. In the 
figure, $k = 4$. The repeating unit cell is highlighted. Note 
that $k$ must be even. A box encircles output numbers. Each 
T gate consumes one $|A\rangle$ state as shown in Fig. 2b. 
FIG. 2: a.) Circuit useful for delaying the application of T 
and eliminating X errors. b.) Circuit implementing a T gate 
using an ancilla state $|A\rangle = (|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt{2}$. 
FIG. 3: Constant depth extendable circuit implementing 
$3k + 8$ to $k$ state distillation for $k = 4$. Boxes encircle output 
numbers. Using the surface code, bent CNOTs can be implemented 
exactly as shown (see Fig. 5). The repeating unit cell 
is highlighted. 



these stabilizers. If the products of the individual X 
basis measurements comprising these stabilizers are not 
both +1, all output is discarded. 

Assuming the above three checks are passed, all output 
is accepted, with byproduct Z operators noted as follows. 
For each encoded logical qubit $0 \le n < k$, the associated 
logical X operator takes the form $X_{n+2}X_{k+2}X_{k+3}$. If 
the product of these measurements is -1, a byproduct Z 
is associated with output $n$. 
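
The operator supports listed above can be sanity checked with a few lines of code. The sketch below is an illustration only, with hypothetical function names. It uses the standard fact that a product of X operators and a product of Z operators commute exactly when their supports overlap on an even number of qubits, and it assumes qubits are labelled 0 to k+3 as in the stabilizer expressions.

```python
def block_code_supports(k):
    """Qubit supports of the two stabilizer shapes and of the k logical X operators."""
    s0 = {0} | set(range(2, k + 3))       # qubits 0, 2, 3, ..., k+2
    s1 = set(range(1, k + 2)) | {k + 3}   # qubits 1, 2, ..., k+1, k+3
    logical_x = [{n + 2, k + 2, k + 3} for n in range(k)]
    return s0, s1, logical_x


def commute(x_support, z_support):
    """An X-type and a Z-type Pauli product commute iff their overlap is even."""
    return len(x_support & z_support) % 2 == 0


def check_block_code(k):
    """Return (stabilizers mutually commute, logical X operators commute with Z stabilizers)."""
    s0, s1, logical_x = block_code_supports(k)
    shapes = (s0, s1)  # the X-type and Z-type stabilizers share these supports
    stabilizers_ok = all(commute(x, z) for x in shapes for z in shapes)
    logicals_ok = all(commute(lx, z) for lx in logical_x for z in shapes)
    return stabilizers_ok, logicals_ok


if __name__ == "__main__":
    for k in (2, 3, 4, 6):
        print(k, check_block_code(k))
```

For even $k$ both checks pass. For odd $k$ the X-type and Z-type stabilizers overlap on an odd number of qubits and fail to commute, which is consistent with the requirement in Fig. 1 that $k$ be even.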

Fig. 3 shows a rearranged version of Fig. 1 that is more 
convenient for physical implementation. A surface code 
CNOT is shown in Fig. 4 [12, 13, 19]. This topological 
structure can be arbitrarily deformed without changing 
the computation it implements. This permits direct implementation 
of the bent CNOTs (Fig. 5). This can be 
compressed to Fig. 6. See Appendix A for a step-by-step 
description of the compression process and larger versions 
of these figures. 
FIG. 4: a.) CNOT quantum circuit example. b.) Equivalent 
surface code CNOT [12, 13, 19]. Time runs from left to right. 
The scale of the figure is set by the code distance d. Small 
cubes are d/4 a side. Longer blocks have length d. Each unit 
of d in the temporal direction represents a round of error de- 
tection. Each unit of d in the two spatial directions represents 
two qubits. The structures are called defects, and represent 
space-time regions in which error detection has been turned 
off. 






FIG. 5: Depth 31 canonical surface code implementation of 
Fig. 3. A larger version of this figure can be found in 
Appendix A. 



II. OVERHEAD COMPARISON 




FIG. 6: Depth 12 compressed surface code implementation 
of Fig. 3. A larger version of this figure can be found in 
Appendix A along with step-by-step images explaining how 
it was obtained. 



Suppose we desire logical $|A\rangle$ states with error $p_{\rm out}$ 
and can prepare logical $|A\rangle$ states with error $p_{\rm in}$. We will 
consider values $p_{\rm in} = 10^{-2}$, $10^{-3}$, and $10^{-4}$, as this covers 
the currently believable physically achievable range, 
and values $p_{\rm out} = 10^{-5}, \ldots, 10^{-20}$, as this covers essentially 
the entire range that could believably be useful in 
a practical quantum algorithm. 

The process of preparing arbitrary logical states is 
called state injection, and in the surface code approximately 
10 gates are required to work before error protection 
is available [Ej . It is therefore reasonable to assume 
the physical gate error rate $p_g$ is an order of magnitude 
less than $p_{\rm in}$. The logical error rate per round of error 
detection in a square patch of surface code as a function 
of $p_g$ and code distance $d$ is shown in Fig. 7 [50]. 

Focusing initially on the simpler 15-1 concatenated distillation 
process, the topological structure required for a 
single level of distillation is shown in Fig. 8. Dark structures 
are called dual defects, light structures are called 
primal defects. The geometric volume of the structure 
can be defined as the number of primal cubes in a minimum 
volume cuboid containing the structure. In this 
case, the structure is 6 cubes high, 16 cubes wide, and 2 
cubes deep, for a total $V = 192$. Each primal cube has dimensions 
$d/4$, each longer prism has length $d$. Each unit 
FIG. 7: Probability $p_L$ of logical X error per round of surface 
code error correction for various code distances $d$ and physical 
gate error rates $p_g$. The asymptotic curves (dashed lines) are 
quadratic, cubic, quartic for distances $d = 3, 5, 7$ respectively. 




FIG. 8: State distillation method taking 15 input $|A\rangle$ states, 
each with error $p$, and producing with probability $1 - 15p$ a 
single output $|A\rangle$ state with error $35p^3$. Each unit of $d$ in 
the temporal direction (up in this figure) corresponds to a 
round of surface code error detection, each unit of $d$ in the 
two spatial directions corresponds to two qubits. 



of $d$ in the temporal direction (up in Fig. 8) corresponds 
to a round of surface code error detection, each unit of d 
in the two spatial directions corresponds to two qubits. 
It is therefore straightforward to convert the geometric 
volume to an absolute volume in units of qubits-rounds. 
A fragment of the complete structure of edge length 5d/4 
with a primal cube potentially centered within it is called 
a plumbing piece. Geometric volume is therefore in units 
of plumbing pieces. In order to calculate the overhead of 
state distillation, we will need to first reasonably upper 
bound the probability of logical error per plumbing piece. 

Consider a forest of straight, $d$ separated parallel defects 
of circumference $d$, as shown in Fig. 9. Each defect 
can be assumed responsible for logical errors connecting 
it to two of its neighboring defects and also self encircling 



FIG. 9: A forest of $d$ separated straight defects of circumference 
$d$. Two square surfaces of dimension $d \times d$ have been included. 
The logical error rate of these surfaces upper bounds 
the probability of a logical error connecting neighboring defects 
and encircling a single defect. 



logical errors. The probability of each of these types of 
logical error per round of error detection can be upper 
bounded by the probability of logical error per round 
of error detection of a square surface. There are more 
potential logical errors per round connecting opposing 
boundaries in a square surface of distance d than there 
are connecting distinct defects or encircling a single defect. 

Given the per round probability of logical error 
$p_L(d, p_g)$ of a square surface, we can upper bound the 
logical error rate of a plumbing piece $P_L(d, p_g)$ by 
$2 \times 3 \times 5d/4 \times p_L(d, p_g)$, where the factor of $5d/4$ is for the 
number of rounds of error detection in a plumbing piece, 
the factor of 3 is for the number of distinct classes of 
logical error, and the factor of 2 is due to the fact that 
a single plumbing piece can contain both a primal and a 
dual defect. From Fig. 7, $p_L(d, p_g) \approx 0.1(100p_g)^{(d+1)/2}$, 
implying $P_L(d, p_g) \approx d(100p_g)^{(d+1)/2}$. 
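
The bound above can be written out numerically. This is a minimal sketch with hypothetical function names, using only the expressions in the preceding paragraph. Multiplying the factors gives $0.75\,d\,(100p_g)^{(d+1)/2}$, so the quoted form $d(100p_g)^{(d+1)/2}$ rounds the prefactor up slightly.

```python
def surface_error_per_round(d: int, p_g: float) -> float:
    """Fit to Fig. 7: p_L(d, p_g) ~ 0.1 * (100 p_g)^((d + 1) / 2)."""
    return 0.1 * (100 * p_g) ** ((d + 1) / 2)


def plumbing_piece_bound(d: int, p_g: float) -> float:
    """2 (primal and dual) x 3 (error classes) x 5d/4 (rounds) x p_L."""
    rounds = 5 * d / 4
    return 2 * 3 * rounds * surface_error_per_round(d, p_g)


def plumbing_piece_approx(d: int, p_g: float) -> float:
    """Rounded form used in the text: P_L(d, p_g) ~ d * (100 p_g)^((d + 1) / 2)."""
    return d * (100 * p_g) ** ((d + 1) / 2)


if __name__ == "__main__":
    for d in (3, 5, 7, 9):
        exact = plumbing_piece_bound(d, 1e-4)
        rounded = plumbing_piece_approx(d, 1e-4)
        print(d, exact, rounded, exact / rounded)
```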

Given input error rate $p_{\rm in}$, with 15-1 state distillation 
the output error rate can be made arbitrarily close to 
$p_{\rm dist} = 35p_{\rm in}^3$ by using a sufficiently large $d$ to eliminate 
logical errors during distillation. However, logical errors 
do not need to be completely eliminated, and we define 
$\epsilon p_{\rm dist}$ to be the amount of logical error introduced. For 
$\epsilon = 1$, the logical circuitry introduces as much error as 
distillation fails to eliminate, and $p_{\rm out} = (1 + \epsilon)p_{\rm dist}$. We 
shall assume that logical failure anywhere during distillation 
leads to the output being incorrect and accepted. 

Let us consider a specific example. Suppose $p_{\rm in} = 
10^{-3}$, our desired $p_{\rm out} = 10^{-15}$, and our chosen $\epsilon = 1$. 
Our top level of state distillation must therefore have a 
probability of logical error no more than $\epsilon p_{\rm out}/(1 + \epsilon) = 
5 \times 10^{-16}$. Given $V = 192$ for 15-1 state distillation, this 
means we need $V P_L(d, p_g) = 192 P_L(d, 10^{-4}) < 5 \times 10^{-16}$, 
implying $d = 19$. The states input to the top level 
of distillation must have an error rate no more than 
$p = \sqrt[3]{p_{\rm out}/35(1 + \epsilon)} = 2.4 \times 10^{-6}$. Since this is less 
TABLE I: Minimum achieved volumes in qubits-rounds for 
all combinations of $p_{\rm in}$ and $p_{\rm out}$ of interest when using concatenated 
15-1 state distillation. The approximate two orders 
of magnitude volume ratio of $p_{\rm in} = 10^{-2}$ and $10^{-4}$ for 
$p_{\rm out} = 10^{-20}$ is due to the former requiring three levels of distillation 
of distance 13, 21 and 45 respectively, whereas the 
latter requires just two levels of distance 7 and 15 respectively. 
This is directly related to the assumption that the gate error 
rate $p_g$ is $p_{\rm in}/10$, meaning much smaller distances, and hence 
volumes, are required to achieve a given reliability. Bold numbers 
indicate a transition to more levels of distillation. For 
$p_{\rm in} = 10^{-2}$, two levels are required even for $p_{\rm out} = 10^{-5}$, with 
a transition to three levels at $p_{\rm out} = 10^{-12}$. For lower $p_{\rm in}$, only 
one or two levels are required. Italicized entries are smaller 
than their corresponding entries in Table II and Table III. 

than $p_{\rm in}$, more state distillation is required. Our second 
level of state distillation must have a probability 
of logical error no more than $\epsilon p/(1 + \epsilon) = 1.2 \times 10^{-6}$, 
implying $d = 9$. The states input to the second level 
of distillation must have an error rate no more than 
$\sqrt[3]{2.4 \times 10^{-6}/35(1 + \epsilon)} = 3.3 \times 10^{-3}$. Since this is greater 
than $p_{\rm in}$, no further distillation is required. The absolute 
volume of the $d = 19$ top level and 15 $d = 9$ second level 
distillation structures is 3.1 x 10^ qubits-rounds. 
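
The distances and intermediate error rates in this worked example can be reproduced with a short search. The sketch below is illustrative only and rests on stated assumptions: the function names are hypothetical, the plumbing piece error uses the rounded form $d(100p_g)^{(d+1)/2}$, the gate error rate is taken as $p_{\rm in}/10$ at every level, and only odd code distances starting from 3 are searched. It does not attempt the conversion to qubits-rounds.

```python
def plumbing_piece_error(d: int, p_g: float) -> float:
    """Rounded bound from the text: P_L(d, p_g) ~ d * (100 p_g)^((d + 1) / 2)."""
    return d * (100 * p_g) ** ((d + 1) / 2)


def min_distance(volume: int, p_g: float, budget: float) -> int:
    """Smallest odd d (assumed search space) with volume * P_L(d, p_g) < budget."""
    d = 3
    while volume * plumbing_piece_error(d, p_g) >= budget:
        d += 2
    return d


def plan_15_to_1(p_in: float, p_out: float, eps: float, volume: int = 192):
    """Return a list of (distance, required input error) from the top level down."""
    p_g = p_in / 10
    levels = []
    target = p_out
    while True:
        budget = eps * target / (1 + eps)              # allowed logical error
        d = min_distance(volume, p_g, budget)
        p_required = (target / (35 * (1 + eps))) ** (1 / 3)
        levels.append((d, p_required))
        if p_required >= p_in:                         # raw states are good enough
            return levels
        target = p_required


if __name__ == "__main__":
    for d, p_required in plan_15_to_1(p_in=1e-3, p_out=1e-15, eps=1.0):
        print(d, p_required)
```

With $p_{\rm in} = 10^{-3}$, $p_{\rm out} = 10^{-15}$ and $\epsilon = 1$ the search returns two levels, of distance 19 and 9, in agreement with the text. Repeating the call over a range of $\epsilon$ values is the natural extension, as described in the next paragraph.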

In practice, the computation of the previous paragraph 
is performed for a range of values of e, and the value 
leading to minimum volume chosen. Table I contains the 
minimum volumes in qubits-rounds for the range of input 
and output error rates of interest. Our goal is to improve 
these numbers using block code state distillation. Itali- 
cized entries indicate input-output parameters for which 
block code state distillation failed to reduce the overhead. 

Given values of $p_{\rm in}$ and $p_{\rm out}$, we can choose an arbitrary 
value of $k$ and $\epsilon$ for a top level of block code state 
distillation, and calculate the required block input error 
rate $p_k = \sqrt{p_{\rm out}/(3k + 1)(1 + \epsilon)}$. Concatenated 15-1 
distillation will then be used to reduce $p_{\rm in}$ to $p_k$. The geo- 
TABLE II: Minimum achieved volumes in qubits-rounds for 
all combinations of $p_{\rm in}$ and $p_{\rm out}$ of interest when using a top 
level of block code state distillation followed by concatenated 
15-1 state distillation. Bold numbers indicate a transition to 
more levels of distillation. For $p_{\rm in} = 10^{-2}$, two levels, one 
block and one 15-1, are required even for $p_{\rm out} = 10^{-5}$, with 
a transition to two levels of 15-1 at $p_{\rm out} \sim$ . For lower 
$p_{\rm in}$, initially no 15-1 distillation is required. Italicized entries 
are smaller than their corresponding entries in Table I and 
Table III. 



metric volume of block code state distillation is $96k + 216$. 
We must therefore choose a top level code distance 
sufficiently large to satisfy $(96k + 216)P_L(d, p_{\rm in}/10) < 
\epsilon p_{\rm out}/(1 + \epsilon)$. Given the absolute volume $V_b$ of the block 
code used, and the absolute volume $V_{15}$ of each 15-1 concatenated 
structure used to produce an input to the block 
code stage, the total absolute volume assigned to each 
output will be $(V_b + (3k + 8)V_{15})/k$. 
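
The three block code expressions used in this paragraph can be collected as follows. This is a minimal sketch with hypothetical function names. The volumes passed to `volume_per_output` are treated as given numbers in qubits-rounds; how they are obtained from the geometric volume and code distance is not modelled here.

```python
import math


def block_input_error(p_out: float, k: int, eps: float) -> float:
    """Required block input error: p_k = sqrt(p_out / ((3k + 1)(1 + eps)))."""
    return math.sqrt(p_out / ((3 * k + 1) * (1 + eps)))


def block_geometric_volume(k: int) -> int:
    """Geometric volume of the block code structure in plumbing pieces."""
    return 96 * k + 216


def volume_per_output(v_block: float, v_15: float, k: int) -> float:
    """Absolute volume assigned to each output: (V_b + (3k + 8) V_15) / k."""
    return (v_block + (3 * k + 8) * v_15) / k


if __name__ == "__main__":
    for k in (2, 4, 8, 16, 32):
        # Geometric volume per output falls towards 96 as k grows.
        print(k, block_geometric_volume(k) / k, block_input_error(1e-15, k, 1.0))
```

Setting `v_15` to zero corresponds to the case in which the raw injected states already satisfy $p_{\rm in} \le p_k$ and no 15-1 stage is needed below the block code.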

The minimum absolute volume found for arbitrary $k$ 
and $\epsilon$ is shown in Table II. Italicized volumes are lower 
than the corresponding concatenated 15-1 volumes (and 
two-level block code distilled volumes to be discussed 
shortly). In all cases, the volume reduction is less than a 
factor of three and was typically a factor of two for the 
cases in which a reduction was observed at all. Note that 
a reduction is observed when concatenated 15-1 distillation 
needs an additional level (bold entries in Table I). 
This makes sense, as when just a little more distillation 
is required, it is better to use the lower overhead block 
code approach. 

Continuing similarly, we constructed Table III assuming 
two top levels of block code state distillation. We 
found the minimum volume varying $\epsilon$, $k_1$ and $k_2$, where 
$k_1$ and $k_2$ are the $k$ values of the first and second layers 
of block distillation, respectively. Where further improvement 
was observed, this was typically quite modest, 
TABLE III: Minimum achieved volumes in qubits-rounds for 
all combinations of $p_{\rm in}$ and $p_{\rm out}$ of interest when using two top 
levels of block code state distillation followed by concatenated 
15-1 state distillation. Bold numbers indicate a transition 
to more levels of distillation. For all values of $p_{\rm in}$, the first 
entry corresponds to no 15-1 distillation. Italicized entries 
are smaller than their corresponding entries in Table I and 
Table II. 

usually less than a factor of two. 

III. DISCUSSION 

We have presented an explicit extendable topological 
structure corresponding to computation in the surface 



code that implements the block code state distillation 
procedure of [23]. Every effort was made to make this 
topological structure as compact as possible using available 
techniques. Despite this, we found only a modest 
overhead reduction, on average a factor of two to 
three, when using block code state distillation for favorable 
parameters. Parameter ranges were found in which 
block code state distillation led to higher overhead. 

Two research directions will be explored to further reduce 
the overhead of state distillation: first, block 
codes of distance higher than two, and second, more advanced 
methods of compressing the complex and extendable 
encoding circuitry of block codes. 
Appendix A: Step-by-step compression 
FIG. 10: Depth 31 canonical surface code implementation of Fig. 3. Dark structures are called dual defects, light structures 
are called primal defects. The depth is defined to be the maximum left to right number of small primal cubes. All figures in 
this Appendix make use of implicit bridge compression [23], meaning some of the dual defects overlap but this can be shown 
to implement the same computation. 




FIG. 11: The initial two CNOT gates can be interchanged through deformation with the long multi-target CNOT. Each of the 
primal defects has been pushed in as far as possible on both the input and output sides of the circuit, reducing the depth to 25. 
FIG. 13: The dual rings produced by the previous move did not encircle any output qubits. Therefore, these loops can be 
commuted through the last CNOT (using Eq. 9 of [13], namely defects of the same type commute) and removed from the 
structure. The primal junction between red and blue primal defect strands can be moved towards output, creating sufficient 
space to compress the total structure to depth 23. 




FIG. 14: The second and third from top primal defect strands can be interchanged and the dual defects associated with the 
initial two CNOT gates converted into rings using Eq. 12 of [13]. The dual rings can be removed from the structure as they do 
not involve any output qubits. 




FIG. 15: Using Eq. 12 of [13], the first multi-target CNOT gate can be converted to a single connected primal structure and a 
large dual cage that takes the form of connected rings. 




FIG. 16: The dual cage produced by the previous move did not interact with any output qubits. Therefore, the structure 
can be commuted all the way to output and removed from the circuit, enabling the depth to be reduced to 22. 




FIG. 17: The second layer of CNOT gates connecting red and green defect strands can be converted to a junction and the 
resulting dual ring commuted through and removed from the circuit, enabling the depth to be reduced to 20. 
FIG. 19: The second last layer of CNOTs has been deformed to a linear chain, enabling the depth to be reduced to 18. 




FIG. 20: The third last group of CNOTs has been deformed to a linear chain. 




FIG. 22: The third group of CNOTs has been deformed to a linear chain, enabling the depth to be reduced to 16. 




FIG. 23: The second group of CNOTs has been deformed to a linear chain, enabling the depth to be reduced to 14. 




FIG. 24: The dual defects associated with the fifth group of CNOTs have had their target braiding order reversed (Eq. 9 of 
[13]), making the output dual strand lie entirely on the upper level. 




FIG. 25: The fifth and sixth groups of CNOTs are bridged [24] together, reducing the depth to 13. 




FIG. 26: The lower level of primal defects has been pushed towards the output side of the circuit, creating additional space on 
the lower level. 




FIG. 27: The dual defects of the second group of CNOTs are rearranged closer to the vertical primal pillars to provide sufficient 
space for the next move. 




FIG. 28: The initial U structure of each output qubit is rotated 90 degrees, enabling the depth to be reduced to 12. Note that 
this topological structure can be extended to arbitrary k. 



